Nucleotide sequences and polypeptides encoded thereby useful for modifying plant characteristics

ABSTRACT

Isolated polynucleotides and polypeptides encoded thereby are described, together with the use of those products for making transgenic plants.

CROSS REFERENCE TO RELATED APPLICATION

This application is a Continuation of co-pending application Ser. No. 11/241,607, filed on Sep. 30, 2005, the entire contents of which are hereby incorporated by reference and for which priority is claimed under 35 U.S.C. §120.

Application Ser. No. 11/241,607 claims priority under 35 U.S.C. §119(e) on U.S. Provisional Application No(s). 60/615,270 filed on Sep. 30, 2004, Application No. 60/638,820 filed on Dec. 22, 2004, Application No. 60/637,210 filed on Dec. 16, 2004, Application No. 60/614,271 filed on Sep. 30, 2004, Application No. 60/614,332 filed on Sep. 30, 2004 and Application No. 60/627,206 filed on Nov. 12, 2004, the entire contents of which are hereby incorporated by reference.

FIELD OF THE INVENTION

The present invention relates to isolated polynucleotides, polypeptides encoded thereby, and the use of those products for making transgenic organisms, such as transgenic plants.

BACKGROUND OF THE INVENTION

There are more than 300,000 species of plants. They show a wide diversity of forms, ranging from delicate liverworts, adapted for life in a damp habitat, to cacti, capable of surviving in the desert. The plant kingdom ranges from herbaceous plants, such as corn, whose life cycle is measured in months, to the giant redwood tree, which can live for thousands of years. This diversity reflects the adaptations of plants to survive in a wide range of habitats. This is seen most clearly in the flowering plants (phylum Angiospermophyta), which are the most numerous, with over 250,000 species. They are also the most widespread, being found from the tropics to the arctic.

The process of plant breeding involving man's intervention in natural breeding and selection is some 20,000 years old. It has produced remarkable advances in adapting existing species to serve new purposes. The world's economy was largely based on the successes of agriculture for most of these 20,000 years.

Plant breeding involves choosing parents, making crosses to allow recombination of genes (alleles) and searching for and selecting improved forms. Success depends on the genes/alleles available, the combinations required and the ability to create and find the correct combinations necessary to give the desired properties to the plant. Molecular genetics technologies are now capable of providing new genes, new alleles and the means of creating and selecting plants with the new, desired characteristics.

Plants specifically improved for agriculture, horticulture, forestry and other industries (such as paper, bioconversion, textile, plants as chemical factories, etc.) can be obtained using molecular technologies. As an example, great agronomic value can result from modulating the size of a plant as a whole or of any of its organs. The green revolution came about as a result of creating dwarf wheat plants which produced a higher seed yield than taller plants because they could withstand higher levels and inputs of fertilizer and water.

Similarly, modulation of the size and stature of an entire plant, or a particular portion of a plant, allows production of plants better suited for a particular industry. For example, reductions in the height of specific ornamentals, crops and tree species can be beneficial by allowing easier harvesting. Alternatively, increasing height may be beneficial by providing more biomass. Other examples of commercially desirable traits include increasing the length of the floral stems of cut flowers, increasing or altering leaf size and shape, enhancing the size of seeds and/or fruits, enhancing yields by specifically stimulating hormone (e.g. brassinolide) synthesis and stimulating early flowering or evoking late flowering by altering levels of gibberellic acid or other hormones in specific cells. Changes in organ size and biomass also result in changes in the mass of constituent molecules such as secondary products.

To summarize, molecular genetic technologies provide the ability to modulate and manipulate growth, development and biochemistry of the entire plant as well as at the cell, tissue and organ levels. Thus, plant morphology, development and biochemistry are altered to maximize or minimize the desired plant trait.

SUMMARY OF THE INVENTION

The present invention, therefore, relates to isolated polynucleotides, polypeptides encoded thereby, and the use of those products for making transgenic organisms, such as plants, bacteria, yeast, fungi and mammals, depending upon the desired characteristics.

In the fields of agriculture and forestry, efforts are constantly being made to produce plants with improved characteristics, such as increased overall yield or increased yield of biomass or chemical components, in particular in order to guarantee the supply of food and of renewable raw materials to the constantly increasing world population. Conventionally, people try to obtain plants with an increased yield by breeding, but this is time-consuming and labor-intensive. Furthermore, appropriate breeding programs must be performed for each relevant plant species.

Over the last two decades, progress has been made by the genetic manipulation of plants, that is, by introducing into plants recombinant nucleic acid molecules and expressing them as exogenous genes or using them to silence endogenous genes within these plants. Such approaches have the advantage of not usually being limited to one plant species, but being transferable to other plant species and other organisms as well. EP-A 0 511 979, for example, discloses that the expression of a prokaryotic asparagine synthetase in plant cells inter alia leads to an increase in biomass production. Similarly, WO 96/21737 describes the production of plants with increased yield from the expression of deregulated or unregulated fructose-1,6-bisphosphatase due to an increased rate of photosynthesis. Nevertheless, there is still a need for generally applicable processes that lead to improved characteristics (such as yield) in relevant plants associated with a wide array of industrial purposes.

BRIEF DESCRIPTION OF THE TABLES

Table 1: Knock-in Table

The Knock-In Table presents the results of knock-in experiments wherein plants are grown from tissues transformed with a marker gene-containing insert and phenotypes are ascertained from the transformed plants. Each section of the Table relating to information on a new transformant begins with a heading “Knock-in phenotype in gene (cDNA_id):” followed by a number which represents the Ceres internal code for a proprietary cDNA sequence. The transformant described is prepared by procedures described herein and the marker gene-containing insert interrupts the Ceres proprietary cDNA_id (corresponding to the cDNA_id in the Reference and Sequence Tables) identified. The following information is presented for each section.

-   Clone ID: presents the clone number of the Ceres proprietary clone that is the source of the cDNA_id.
-   Promoter: identifies the promoter utilized.
-   Phenotype ID: represents an internal identification code.
-   Unique F1 plant ID: represents the internal code for the F1 plant for which a phenotype is described.
-   Assay: presents the type of growth analyzed (e.g. soil gross morphology), followed by the assay name, which corresponds to the type/location of the tissue that was observed and names the assay whose result provided the identified phenotype.
-   Phenotype: describes the phenotype noted for the F1 generation transformant.
-   Notes: provide additional information on the described phenotype for the transformant.

Each knock-in that represents a transformant with an interruption in the identified cDNA_id may be correlated with more than one identified phenotype.

DETAILED DESCRIPTION OF THE INVENTION

1. Definitions

The following terms are utilized throughout this application:

Domain: Domains are fingerprints or signatures that can be used to characterize protein families and/or parts of proteins. Such fingerprints or signatures can comprise conserved (1) primary sequence, (2) secondary structure, and/or (3) three-dimensional conformation. Generally, each domain has been associated with either a family of proteins or motifs. Typically, these families and/or motifs have been correlated with specific in-vitro and/or in-vivo activities. A domain can be any length, including the entirety of the sequence of a protein. Detailed descriptions of the domains, associated families and motifs, and correlated activities of the polypeptides of the instant invention are described below. Usually, the polypeptides with designated domain(s) can exhibit at least one activity that is exhibited by any polypeptide that comprises the same domain(s). Domains also define areas of non-coding sequences such as promoters and miRNAs.

Endogenous: The term “endogenous,” within the context of the current invention, refers to any polynucleotide, polypeptide or protein sequence which is a natural part of a cell or organism regenerated from said cell.

Exogenous: “Exogenous,” as referred to within, is any polynucleotide, polypeptide or protein sequence, whether chimeric or not, that is initially or subsequently introduced into the genome of an individual host cell or the organism regenerated from said host cell by any means other than by a sexual cross. Examples of means by which this can be accomplished are described below, and include Agrobacterium-mediated transformation (for dicots, e.g. Salomon et al. (1984) EMBO J. 3:141; Herrera-Estrella et al. (1983) EMBO J. 2:987; for monocots, representative papers are those by Escudero et al. (1996) Plant J. 10:355; Ishida et al. (1996) Nature Biotechnology 14:745; May et al. (1995) Bio/Technology 13:486), biolistic methods (Armaleo et al. (1990) Current Genetics 17:97), electroporation, in planta techniques, and the like. The term “exogenous” as used herein is also intended to encompass inserting a naturally found element into a non-naturally found location.

Gene: The term “gene,” as used in the context of the current invention, encompasses all regulatory and coding sequence contiguously associated with a single hereditary unit with a genetic function. Genes can include non-coding sequences that modulate the genetic function that include, but are not limited to, those that specify polyadenylation, transcriptional regulation, DNA conformation, chromatin conformation, extent and position of base methylation and binding sites of proteins that control all of these. Genes comprised of “exons” (coding sequences), which may be interrupted by “introns” (non-coding sequences), encode proteins. A gene's genetic function may require only RNA expression or protein production, or may only require binding of proteins and/or nucleic acids without associated expression. In certain cases, genes adjacent to one another may share sequence in such a way that one gene will overlap the other. A gene can be found within the genome of an organism, artificial chromosome, plasmid, vector, etc., or as a separate isolated entity.

Heterologous sequences: “Heterologous sequences” are those that are not operatively linked or are not contiguous to each other in nature. For example, a promoter from corn is considered heterologous to an Arabidopsis coding region sequence. Also, a promoter from a gene encoding a growth factor from corn is considered heterologous to a sequence encoding the corn receptor for the growth factor. Regulatory element sequences, such as UTRs or 3′ end termination sequences that do not originate in nature from the same gene as the coding sequence originates from, are considered heterologous to said coding sequence. Elements operatively linked in nature and contiguous to each other are not heterologous to each other. On the other hand, these same elements remain operatively linked but become heterologous if other filler sequence is placed between them. Thus, the promoter and coding sequences of a corn gene expressing an amino acid transporter are not heterologous to each other, but the promoter and coding sequence of a corn gene operatively linked in a novel manner are heterologous.

Homologous gene: In the current invention, “homologous gene” refers to a gene that shares sequence similarity with the gene of interest. This similarity may be in only a fragment of the sequence and often represents a functional domain, examples including, without limitation, a DNA binding domain, a domain with tyrosine kinase activity, or the like. The functional activities of homologous genes are not necessarily the same.

Misexpression: The term “misexpression” refers to an increase or a decrease in the transcription of a coding region into a complementary RNA sequence as compared to the parental wild-type. This term also encompasses expression of a gene or coding region for a different time period as compared to the wild-type and/or from a non-natural location within the plant genome.

Percentage of sequence identity: “Percentage of sequence identity,” as used herein, is determined by comparing two optimally aligned sequences over a comparison window, where the fragment of the polynucleotide or amino acid sequence in the comparison window may comprise additions or deletions (e.g., gaps or overhangs) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. (USA) 85: 2444, by computerized implementations of algorithms such as GAP, BESTFIT, BLAST, FASTA, and TFASTA (Accelrys, Inc., 10188 Telesis Court, Suite 100 San Diego, Calif. 92121) or by inspection. Typically, the default values of 5.00 for gap weight and 0.30 for gap weight length are used. The term “substantial sequence identity” between polynucleotide or polypeptide sequences refers to polynucleotide or polypeptide comprising a sequence that has at least 80% sequence identity, preferably at least 85%, more preferably at least 90% and most preferably at least 95%, even more preferably, at least 96%, 97%, 98% or 99% sequence identity compared to a reference sequence using the programs.
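The calculation described in this definition can be sketched as follows. This is a minimal illustration only: it assumes the two sequences have already been optimally aligned by one of the cited programs, that gaps are written as "-", and that the comparison window is the full aligned length. The function name `percent_identity` is hypothetical and is not part of any of the named packages.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Assumption: both strings are the same window of an existing alignment,
    with gaps marked by `gap`. A gap never counts as a matched position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count positions where the identical base or residue occurs in both.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Divide by the total number of positions in the window, times 100.
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

Under these assumptions a pair would meet the "at least 80%" threshold when the returned value is 80.0 or greater.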

Regulatory Sequence: The term “regulatory sequence,” as used in the current invention, refers to any nucleotide sequence that influences transcription or translation initiation and rate, and stability and/or mobility of the transcript or polypeptide product. Regulatory sequences include, but are not limited to, promoters, promoter control elements, protein binding sequences, 5′ and 3′ UTRs, transcriptional start site, termination sequence, polyadenylation sequence, introns, certain sequences within a coding sequence, etc.

Stringency: “Stringency” as used herein is a function of probe length, probe composition (G+C content), and salt concentration, organic solvent concentration, and temperature of hybridization or wash conditions. Stringency is typically compared by the parameter T_m, which is the temperature at which 50% of the complementary molecules in the hybridization are hybridized, in terms of a temperature differential from T_m. High stringency conditions are those providing a condition of T_m −5° C. to T_m −10° C. Medium or moderate stringency conditions are those providing T_m −20° C. to T_m −29° C. Low stringency conditions are those providing a condition of T_m −40° C. to T_m −48° C. The relationship of hybridization conditions to T_m (in ° C.) is expressed in the mathematical equation

$$T_m = 81.5 + 16.6\,\log_{10}[\mathrm{Na}^+] + 0.41\,(\%\,\mathrm{G{+}C}) - \frac{600}{N} \tag{1}$$

where N is the length of the probe. This equation works well for probes 14 to 70 nucleotides in length that are identical to the target sequence. The equation below for T_m of DNA-DNA hybrids is useful for probes in the range of 50 to greater than 500 nucleotides, and for conditions that include an organic solvent (formamide).

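$$T_m = 81.5 + 16.6\,\log_{10}\!\left\{\frac{[\mathrm{Na}^+]}{1 + 0.7\,[\mathrm{Na}^+]}\right\} + 0.41\,(\%\,\mathrm{G{+}C}) - \frac{500}{L} - 0.63\,(\%\,\mathrm{formamide}) \tag{2}$$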

where L is the length of the probe in the hybrid. (P. Tijssen, “Hybridization with Nucleic Acid Probes” in Laboratory Techniques in Biochemistry and Molecular Biology, P. C. van der Vliet, ed., c. 1993 by Elsevier, Amsterdam.) The T_m of equation (2) is affected by the nature of the hybrid; for DNA-RNA hybrids T_m is 10-15° C. higher than calculated, for RNA-RNA hybrids T_m is 20-25° C. higher. Because the T_m decreases about 1° C. for each 1% decrease in homology when a long probe is used (Bonner et al. (1973) J. Mol. Biol. 81:123), stringency conditions in polynucleotide hybridization reactions can be adjusted to favor hybridization of polynucleotides from identical genes or related family members.
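As a worked illustration of the two T_m relationships, the sketch below evaluates both for given salt, G+C, probe length and formamide values. It rests on stated assumptions: the salt term is taken with a positive coefficient (+16.6 log10) in both equations, the formamide term is subtracted, [Na+] is in mol/L, and G+C and formamide are in percent. The function names are hypothetical.

```python
import math


def tm_short_probe(na_molar: float, gc_percent: float, length: int) -> float:
    """Equation (1): T_m in deg C for probes of roughly 14 to 70 nucleotides.

    Assumes the salt term enters as +16.6 * log10([Na+]).
    """
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / length


def tm_long_probe(
    na_molar: float,
    gc_percent: float,
    length: int,
    formamide_percent: float = 0.0,
) -> float:
    """Equation (2): T_m in deg C for DNA-DNA hybrids of about 50 to >500 nt.

    Assumes the formamide term is subtracted (0.63 deg C per 1% formamide).
    """
    salt_term = 16.6 * math.log10(na_molar / (1.0 + 0.7 * na_molar))
    return (
        81.5
        + salt_term
        + 0.41 * gc_percent
        - 500.0 / length
        - 0.63 * formamide_percent
    )


if __name__ == "__main__":
    # Illustrative inputs only, not values taken from the specification.
    print(round(tm_short_probe(1.0, 50.0, 20), 1))
    print(round(tm_long_probe(1.0, 50.0, 500, formamide_percent=50.0), 1))
```

The corrections stated in the text for DNA-RNA and RNA-RNA hybrids, and the roughly 1° C. drop per 1% decrease in homology, would be applied to the value returned by `tm_long_probe`.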

Equation (2) is derived assuming equilibrium and therefore, hybridizations according to the present invention are most preferably performed under conditions of probe excess and for sufficient time to achieve equilibrium. The time required to reach equilibrium can be shortened by inclusion of a hybridization accelerator such as dextran sulfate or another high volume polymer in the hybridization buffer.

Stringency conditions can be selected during the hybridization reaction or after hybridization has occurred by altering the salt and temperature conditions of the wash solutions used. The formulas shown above are equally valid when used to compute the stringency of a wash solution. Preferred wash solution stringencies lie within the ranges stated above; high stringency is 5-8° C. below T_m, medium or moderate stringency is 26-29° C. below T_m and low stringency is 45-48° C. below T_m.
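The temperature windows given in the stringency definition can be expressed as a simple lookup. The sketch below is illustrative only: it uses the broader hybridization windows (5-10, 20-29 and 40-48° C. below T_m), treats the bounds as inclusive, and returns "unclassified" for temperatures that fall between the stated windows. The names `STRINGENCY_WINDOWS` and `classify_stringency` are hypothetical.

```python
# Degrees C below T_m for each stringency class, per the definition above.
STRINGENCY_WINDOWS = {
    "high": (5.0, 10.0),
    "medium": (20.0, 29.0),
    "low": (40.0, 48.0),
}


def classify_stringency(tm_c: float, temperature_c: float) -> str:
    """Return the stringency class for a hybridization or wash temperature.

    Assumption: window bounds are inclusive; temperatures outside every
    window are reported as "unclassified" rather than forced into a class.
    """
    delta = tm_c - temperature_c  # how far below T_m the condition sits
    for label, (low, high) in STRINGENCY_WINDOWS.items():
        if low <= delta <= high:
            return label
    return "unclassified"


if __name__ == "__main__":
    for temp in (65.0, 47.0, 27.0, 57.0):
        print(temp, classify_stringency(72.0, temp))
```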

Substantially free of: A composition containing A is “substantially free of” B when at least 85% by weight of the total A+B in the composition is A. Preferably, A comprises at least about 90% by weight of the total of A+B in the composition, more preferably at least about 95% or even 99% by weight. For example, a plant gene or DNA sequence can be considered substantially free of other plant genes or DNA sequences.

Translational start site: In the context of the current invention, a “translational start site” is usually an ATG in the cDNA transcript, more usually the first ATG. A single cDNA, however, may have multiple translational start sites.
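A minimal sketch of locating candidate translational start sites in a cDNA sequence follows. It simply reports every ATG position (0-based), with the first one being the most usual start site per the definition above; it makes no claim about which candidate is actually used in vivo. The function name is hypothetical.

```python
from typing import List


def candidate_start_sites(cdna: str) -> List[int]:
    """Return 0-based positions of every ATG in the cDNA, in order.

    Assumption: the input is the sense-strand cDNA written 5' to 3'.
    """
    seq = cdna.upper()
    return [i for i in range(len(seq) - 2) if seq[i : i + 3] == "ATG"]


if __name__ == "__main__":
    sites = candidate_start_sites("ccATGaaATGt")
    print(sites)                         # all candidate ATG positions
    print(sites[0] if sites else None)   # first ATG, the most usual start
```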

Transcription start site: “Transcription start site” is used in the current invention to describe the point at which transcription is initiated. This point is typically located about 25 nucleotides downstream from a TFIID binding site, such as a TATA box. Transcription can initiate at one or more sites within the gene, and a single gene may have multiple transcriptional start sites, some of which may be specific for transcription in a particular cell-type or tissue.

Untranslated region (UTR): A “UTR” is any contiguous series of nucleotide bases that is transcribed, but is not translated. These untranslated regions may be associated with particular functions such as increasing mRNA message stability. Examples of UTRs include, but are not limited to, polyadenylation signals, termination sequences, sequences located between the transcriptional start site and the first exon (5′ UTR) and sequences located between the last exon and the end of the mRNA (3′ UTR).

Variant: The term “variant” is used herein to denote a polypeptide or protein or polynucleotide molecule that differs from others of its kind in some way. For example, polypeptide and protein variants can consist of changes in amino acid sequence and/or charge and/or post-translational modifications (such as glycosylation, etc.).

2. Important Characteristics of the Polynucleotides of the Invention

The genes and polynucleotides of the present invention are of interest because when they are misexpressed (i.e. when overexpressed at a non-natural location or in an increased amount) or when they allow silencing of endogenous genes, they produce plants with important modified characteristics as discussed below. These traits can be used to exploit or maximize plant products or to minimize undesirable characteristics. For example, an increase in plant height is beneficial in species grown or harvested for their main stem or trunk, such as ornamental cut flowers, fiber crops (e.g. flax, kenaf, hesperaloe, hemp) and wood producing trees. Increase in inflorescence thickness is also desirable for some ornamentals, while increases in the number, shape and size of leaves can lead to increased production/harvest from leaf crops such as lettuce, spinach, cabbage, switch grass and tobacco. Likewise, a decrease in plant height is beneficial in species that are particularly susceptible to lodging or uprooting due to wind stress.

The polynucleotides and polypeptides of the invention were isolated from different plant species as noted in the Sequence Listing. The polynucleotides and polypeptides are useful to confer on transgenic plants the properties identified for each sequence in the relevant portion (miscellaneous feature section) of the Sequence Listing. The miscellaneous feature section of the sequence listing contains, for each sequence, a description of the domain or other characteristic from which the sequence has the function known in the art for other sequences. Some identified domains are indicated with “PFam Name”, signifying that the pfam name and description can be found in the pfam database available via the internet. Other domains are indicated by reference to a “GI Number” from the public sequence database maintained by GenBank under the NCBI, including the non-redundant (NR) database.

The sequences of the invention can be applied to substrates for use in microarray applications such as, but not limited to, assays of global gene expression under varying development and growth conditions. The microarrays are also used for diagnostic or forensic purposes. Arrays can be produced using different procedures such as those from Affymetrix or Agilent. Protocols for these procedures can be obtained from these companies or found via the internet.

The polynucleotides, or fragments thereof, can also be used as probes and primers. Probe length varies depending on the application. For use as primers, probes are 12-40 nucleotides, preferably 18-30 nucleotides long. For use in mapping, probes are 50 to 500 nucleotides, preferably 100-250 nucleotides long. For Southern hybridizations, probes as long as several kilobases are used.

The probes and/or primers are produced by synthetic procedures such as the triester method of Matteucci et al. (1981) J. Am. Chem. Soc. 103:3185 or according to Urdea et al. (1981) Proc. Natl. Acad. Sci. 80:7461 or using commercially available automated oligonucleotide synthesizers.

The polynucleotides of the invention can be utilized in a number of methods known to those skilled in the art as probes and/or primers to isolate and detect polynucleotides including, without limitation: Southerns, Northerns, branched DNA hybridization assays, polymerase chain reaction, microarray assays and variations thereof. Specific methods given by way of example, and discussed below, include:

Hybridization

Methods of Mapping

Southern Blotting

Isolating cDNA from Related Organisms

Isolating and/or Identifying Homologous and Orthologous Genes

Also, the nucleic acid molecules of the invention can be used in other methods, such as high density oligonucleotide hybridizing assays, described, for example, in U.S. Pat. Nos. 6,004,753 and 5,945,306.

The polynucleotides or fragments thereof of the present invention can be used as probes and/or primers for detection and/or isolation of related polynucleotide sequences through hybridization. Hybridization of one nucleic acid to another constitutes a physical property that defines the polynucleotide of the invention and the identified related sequences. Also, such hybridization imposes structural limitations on the pair. A good general discussion of the factors for determining hybridization conditions is provided by Sambrook et al. (“Molecular Cloning, a Laboratory Manual”, 2nd ed., c. 1989 by Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; see esp. chapters 11 and 12). Additional considerations and details of the physical chemistry of hybridization are provided by G. H. Keller and M. M. Manak, “DNA Probes”, 2nd Ed., pp. 1-25, c. 1993 by Stockton Press, New York, N.Y.

When using the polynucleotides to identify homologous genes in other species, the practitioner will preferably adjust the amount of target DNA of each species so that, as nearly as is practical, the same number of genome equivalents are present for each species examined. This prevents faint signals from species having large genomes, and thus small numbers of genome equivalents per mass of DNA, from erroneously being interpreted as absence of the corresponding gene in the genome.
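The adjustment described here amounts to loading DNA mass in proportion to genome size. The sketch below shows that arithmetic under simple assumptions: genome sizes are known in megabases, one species is chosen as the reference, and the species names and sizes in the example are hypothetical placeholders rather than measured values.

```python
from typing import Dict


def dna_mass_for_equal_genome_equivalents(
    genome_sizes_mb: Dict[str, float],
    reference: str,
    reference_mass_ug: float,
) -> Dict[str, float]:
    """Mass of target DNA per species giving the same genome-equivalent count.

    Genome equivalents per unit mass scale as 1 / genome size, so the mass
    to load scales linearly with genome size relative to the reference.
    """
    ref_size = genome_sizes_mb[reference]
    return {
        name: reference_mass_ug * size / ref_size
        for name, size in genome_sizes_mb.items()
    }


if __name__ == "__main__":
    # Hypothetical genome sizes (Mb), for illustration only.
    sizes = {"species_a": 100.0, "species_b": 2500.0}
    print(dna_mass_for_equal_genome_equivalents(sizes, "species_a", 1.0))
```

With these placeholder numbers, a species whose genome is 25 times larger needs 25 times the DNA mass to present the same number of genome equivalents.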

The probes and/or primers of the instant invention can also be used to detect or isolate nucleotides that are “identical” to the probes or primers. Two nucleic acid sequences or polypeptides are said to be “identical” if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below.

Isolated polynucleotides within the scope of the invention also include allelic variants of the specific sequences presented in the Sequence Listing. The probes and/or primers of the invention are also used to detect and/or isolate polynucleotides exhibiting at least 80% sequence identity with the sequences of the Sequence Listing or fragments thereof. Related polynucleotide sequences can also be identified according to the methods described in U.S. Patent Publication 20040137466A1, dated Jul. 15, 2004 to Jofuku et al.

With respect to nucleotide sequences, degeneracy of the genetic code provides the possibility to substitute at least one nucleotide of the nucleotide sequence of a gene with a different nucleotide without changing the amino acid sequence of the polypeptide. Hence, the DNA of the present invention also includes any base sequence that has been changed from a sequence in the Sequence Listing by substitution in accordance with degeneracy of the genetic code. References describing codon usage include: Carels et al. (1998) J. Mol. Evol. 46: 45 and Fennoy et al. (1993) Nucl. Acids Res. 21(23): 5294.
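The effect of degeneracy can be illustrated by translating two coding sequences that differ at a nucleotide and checking that they encode the same polypeptide. The sketch below assumes the standard genetic code and an in-frame coding sequence; the helper names `translate` and `encode_same_polypeptide` and the short example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order
# (first base varies slowest, third base fastest); "*" marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame coding DNA sequence into one-letter amino acids."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i : i + 3]] for i in range(0, len(seq), 3))


def encode_same_polypeptide(dna_a: str, dna_b: str) -> bool:
    """True when two coding sequences translate to the same polypeptide."""
    return translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so the substitution is silent.
    print(encode_same_polypeptide("ATGGCTTAA", "ATGGCCTAA"))
    # GAT encodes aspartate, so this substitution changes the polypeptide.
    print(encode_same_polypeptide("ATGGCTTAA", "ATGGATTAA"))
```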

The polynucleotides of the invention are also used to create various types of genetic and physical maps of the genome of the plant species listed in the Sequence Listing. Some are absolutely associated with particular phenotypic traits, allowing construction of gross genetic maps. Creation of such maps is based on differences or variants, generally referred to as polymorphisms, between different parents used in crosses. Common methods of detecting polymorphisms that can be used are restriction fragment length polymorphisms (RFLPs), single nucleotide polymorphisms (SNPs) or simple sequence repeats (SSRs).

The use of RFLPs and of recombinant inbred lines for such genetic mapping is described for Arabidopsis by Alonso-Blanco et al. (Methods in Molecular Biology, vol. 82, “Arabidopsis Protocols”, pp. 137-146, J. M. Martinez-Zapater and J. Salinas, eds., c. 1998 by Humana Press, Totowa, N.J.) and for corn by Burr (“Mapping Genes with Recombinant Inbreds”, pp. 249-254. In Freeling, M. and V. Walbot (Ed.), The Maize Handbook, c. 1994 by Springer-Verlag New York, Inc.: New York, N.Y., USA; Berlin, Germany; Burr et al. Genetics (1998) 118: 519; Gardiner, J. et al. (1993) Genetics 134: 917). This procedure, however, is not limited to plants and is used for other organisms (such as yeast) or for individual cells.

The polynucleotides of the present invention are also used for simple sequence repeat (SSR) mapping. Rice SSR mapping is described by Morgante et al. (The Plant Journal (1993) 3: 165), Panaud et al. (Genome (1995) 38: 1170), Senior et al. (Crop Science (1996) 36: 1676), Taramino et al. (Genome (1996) 39: 277) and Ahn et al. (Molecular and General Genetics (1993) 241: 483-90). SSR mapping is achieved using various methods. In one instance, polymorphisms are identified when sequence specific probes contained within a polynucleotide flanking an SSR are made and used in polymerase chain reaction (PCR) assays with template DNA from two or more individuals of interest. Here, a change in the number of tandem repeats between the SSR-flanking sequences produces differently sized fragments (U.S. Pat. No. 5,766,847). Alternatively, polymorphisms are identified by using the PCR fragment produced from the SSR-flanking sequence specific primer reaction as a probe against Southern blots representing different individuals (U. H. Refseth et al. (1997) Electrophoresis 18: 1519).

The polynucleotides of the invention can further be used to identify certain genes or genetic traits using, for example, known AFLP technologies, such as in EP0534858 and U.S. Pat. No. 5,878,215.

The polynucleotides of the present invention are also used for single nucleotide polymorphism (SNP) mapping.

Genetic and physical maps of crop species have many uses. For example, these maps are used to devise positional cloning strategies for isolating novel genes from the mapped crop species. In addition, because the genomes of closely related species are largely syntenic (i.e. they display the same ordering of genes within the genome), these maps are used to isolate novel alleles from relatives of crop species by positional cloning strategies.

The various types of maps discussed above are used with the polynucleotides of the invention to identify Quantitative Trait Loci (QTLs). Many important crop traits, such as the solids content of tomatoes, are quantitative traits and result from the combined interactions of several genes. These genes reside at different loci in the genome, oftentimes on different chromosomes, and generally exhibit multiple alleles at each locus. The polynucleotides of the invention are used to identify QTLs and isolate specific alleles as described by de Vicente and Tanksley (Genetics (1993) 134:585). Once a desired allele combination is identified, crop improvement is accomplished either through biotechnological means or by directed conventional breeding programs (for review see Tanksley and McCouch (1997) Science 277:1063). In addition to isolating QTL alleles in present crop species, the polynucleotides of the invention are also used to isolate alleles from the corresponding QTL of wild relatives.

In another embodiment, the polynucleotides are used to help create physical maps of the genome of the plant species mentioned in the Sequence Listing and related species thereto. Where polynucleotides are ordered on a genetic map, as described above, they are used as probes to discover which clones in large libraries of plant DNA fragments in YACs, BACs, etc. contain the same polynucleotide or similar sequences, thereby facilitating the assignment of the large DNA fragments to chromosomal positions. Subsequently, the large BACs, YACs, etc. are ordered unambiguously by more detailed studies of their sequence composition (e.g. Marra et al. (1997) Genome Research 7:1072-1084) and by using their end or other sequences to find the identical sequences in other cloned DNA fragments. The overlapping of DNA sequences in this way allows large contigs of plant sequences to be built that, when sufficiently extended, provide a complete physical map of a chromosome. Sometimes the polynucleotides themselves provide the means of joining cloned sequences into a contig. All scientific and patent publications cited in this paragraph are hereby incorporated by reference.

U.S. Pat. Nos. 6,287,778 and 6,500,614, both hereby incorporated by reference, describe scanning multiple alleles of a plurality of loci using hybridization to arrays of oligonucleotides. These techniques are useful for each of the types of mapping discussed above.

Following the procedures described above and using a plurality of the polynucleotides of the present invention, any individual is genotyped. These individual genotypes are used for the identification of particular cultivars, varieties, lines, ecotypes and genetically modified plants or can serve as tools for subsequent genetic studies involving multiple phenotypic traits.

Identification and isolation of orthologous genes from closely related species and alleles within a species is particularly desirable because of their potential for crop improvement. Many important crop traits result from the combined interactions of the products of several genes residing at different loci in the genome. Generally, alleles at each of these loci make quantitative differences to the trait. Once a more favorable allele combination is identified, crop improvement is accomplished either through biotechnological means or by directed conventional breeding programs (Tanksley and McCouch (1997) Science 277:1063).

3. Use of the Genes to Make Transgenic Plants

To use the sequences of the present invention or a combination of them or parts and/or mutants and/or fusions and/or variants of them, recombinant DNA constructs are prepared which comprise the polynucleotide sequences of the invention inserted into a vector, and which are suitable for transformation of plant cells. The construct is made using standard recombinant DNA techniques (Sambrook et al. 1989) and is introduced to the species of interest by Agrobacterium-mediated transformation or by other means of transformation as referenced below.

The sequences of the present invention can be in sense orientation or in antisense orientation.

If a decrease in the transcription or translation product of an endogenous gene (gene silencing) is desired, the sequence of interest is transcribed as an antisense nucleic acid or an interfering RNA similar or identical to part of the endogenous gene. Antisense nucleic acids or interfering RNAs are about 10 nucleotides to about 2,500 nucleotides in length. For example, the nucleic acid of the present invention can be used as an antisense nucleic acid to its corresponding endogenous gene. Alternatively, the transcription product of a nucleic acid of the invention can be similar or identical to the sense coding sequence of its corresponding endogenous gene, but is an RNA that is unpolyadenylated, lacks a 5′ cap structure, or contains an unsplicable intron. The nucleic acid of the present invention in sense orientation can also be used as a partial or full-length coding sequence that results in inhibition of the expression of an endogenous polypeptide by co-suppression. Methods of co-suppression using a full-length cDNA sequence as well as a partial cDNA sequence are known in the art (see, for example, U.S. Pat. No. 5,231,020).
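The relationship between a sense coding fragment and its antisense counterpart can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and are not taken from the Sequence Listing, and the "about 10 to about 2,500 nucleotides" range is simplified here to hard bounds for illustration only.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense):
    """Return the reverse complement of a DNA sense (coding) sequence."""
    return sense.translate(_COMPLEMENT)[::-1]


def within_stated_length(seq, lo=10, hi=2500):
    """Check a fragment against the length range given in the text.

    The text says "about 10" to "about 2,500" nucleotides; treating these
    as exact limits is an assumption made for this sketch.
    """
    return lo <= len(seq) <= hi


sense_fragment = "ATGGCTTCAGGTACC"          # hypothetical 15-nt fragment
antisense_fragment = antisense_dna(sense_fragment)
print(antisense_fragment)                    # GGTACCTGAAGCCAT
print(within_stated_length(antisense_fragment))
```

Applying `antisense_dna` twice returns the original fragment, which is a quick sanity check that the helper computes a true reverse complement.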

Alternatively, a nucleic acid can be transcribed into a ribozyme that affects expression of an mRNA (see U.S. Pat. No. 6,423,885). Heterologous nucleic acids can encode ribozymes designed to cleave particular mRNA transcripts, thus preventing expression of a polypeptide. Hammerhead ribozymes are useful for destroying particular mRNAs, although various ribozymes that cleave mRNA at site-specific recognition sequences can be used. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target RNA contains a 5′-UG-3′ nucleotide sequence. The construction and production of hammerhead ribozymes is known in the art (see, for example, U.S. Pat. No. 5,254,678). Hammerhead ribozyme sequences can be embedded in a stable RNA such as a transfer RNA (tRNA) to increase cleavage efficiency in vivo (Perriman et al. (1995) Proc. Natl. Acad. Sci. USA, 92(13):6175-6179; de Feyter and Gaudron Methods in Molecular Biology, Vol. 74, Chapter 43, “Expressing Ribozymes in Plants”, Edited by Turner, P. C., Humana Press Inc., Totowa, N.J.). RNA endoribonucleases such as the one that occurs naturally in Tetrahymena thermophila and which have been described extensively by Cech and collaborators can also be useful (see, for example, U.S. Pat. No. 4,987,071).
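As a rough illustration of the site requirement stated above, the following Python sketch scans a target mRNA for 5′-UG-3′ dinucleotides and reports the target sequence on either side of a chosen site, against which complementary ribozyme flanking regions would be designed. The function names, the arm length and the example transcript are all hypothetical, and the sketch does not model ribozyme folding or cleavage efficiency.

```python
# Illustrative sketch only: names, arm length and sequence are hypothetical.
def find_ug_sites(mrna):
    """Return 0-based start positions of every 5'-UG-3' dinucleotide."""
    rna = mrna.upper().replace("T", "U")   # accept DNA-style input as well
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] == "UG"]


def target_flanks(mrna, site, arm_len=8):
    """Return (upstream, downstream) target sequence around a UG site.

    Ribozyme flanking regions would be chosen to base-pair with these
    stretches; arm_len is an arbitrary value picked for the example.
    """
    rna = mrna.upper().replace("T", "U")
    upstream = rna[max(0, site - arm_len):site]
    downstream = rna[site + 2:site + 2 + arm_len]
    return upstream, downstream


target = "AUGGCUUGAACUGCA"                 # hypothetical transcript fragment
print(find_ug_sites(target))               # [1, 6, 11]
print(target_flanks(target, 6, arm_len=4)) # ('GGCU', 'AACU')
```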

A nucleic acid of the present invention can also be used for its transcription into an interfering RNA. Such an RNA can be one that can anneal to itself, for example a double stranded RNA having a stem-loop structure. One strand of the stem portion of a double stranded RNA can comprise a sequence that is similar or identical to the sense coding sequence of an endogenous polypeptide and that is about 10 nucleotides to about 2,500 nucleotides in length. Generally, the length of the nucleic acid sequence that is similar or identical to the sense coding sequence can be from 10 nucleotides to 500 nucleotides, from 15 nucleotides to 300 nucleotides, from 20 nucleotides to 100 nucleotides, or from 25 nucleotides to 100 nucleotides. The other strand of the stem portion of a double stranded RNA can comprise an antisense sequence of an endogenous polypeptide and can have a length that is shorter, the same as, or longer than the length of the corresponding sense sequence. The loop portion of a double stranded RNA can be from 10 nucleotides to 500 nucleotides in length, for example from 15 nucleotides to 100 nucleotides, from 20 nucleotides to 300 nucleotides or from 25 nucleotides to 400 nucleotides in length. The loop portion of the RNA can include an intron (see, for example the following publications: WO 98/53083; WO 99/32619; WO 98/36083; WO 99/53050; US 20040214330; US 20030180945; U.S. Pat. No. 5,034,323; U.S. Pat. No. 6,452,067; U.S. Pat. No. 6,777,588; U.S. Pat. No. 6,573,099 and U.S. Pat. No. 6,326,527). Interfering RNA also can be constructed as described in Brummell, et al. (2003) Plant J. 33:793-800.
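The stem-loop arrangement described above (sense strand, loop, antisense strand) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names and sequences are hypothetical, the antisense strand is taken to be exactly as long as the sense strand (the text also allows shorter or longer), and the stated length ranges are treated as hard limits.

```python
# Illustrative sketch only: names, sequences and hard limits are assumptions.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp_rna(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def build_hairpin(sense_stem, loop):
    """Assemble sense stem + loop + antisense stem as one RNA string.

    Length checks follow the ranges given in the text: 10 to 2,500
    nucleotides for the stem strand and 10 to 500 for the loop.
    """
    if not 10 <= len(sense_stem) <= 2500:
        raise ValueError("stem strand outside the 10-2,500 nt range")
    if not 10 <= len(loop) <= 500:
        raise ValueError("loop outside the 10-500 nt range")
    return sense_stem.upper() + loop.upper() + revcomp_rna(sense_stem)


sense = "AUGGCUUCAGGUACC"     # hypothetical 15-nt sense stem
loop = "GAAAUUUCGA"           # hypothetical 10-nt loop (could be an intron)
hairpin = build_hairpin(sense, loop)
print(hairpin)
print(len(hairpin))           # 15 + 10 + 15 = 40
```

Because the last 15 nucleotides are the reverse complement of the first 15, the transcript can anneal to itself to form the double stranded stem.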

The vector backbone for the recombinant constructs is any of those typical in the art such as plasmids (such as Ti plasmids), viruses, artificial chromosomes, BACs, YACs and PACs and vectors of the sort described by

-   (a) BAC: Shizuya et al. (1992) Proc. Natl. Acad. Sci. USA 89: 8794-8797; Hamilton et al. (1996) Proc. Natl. Acad. Sci. USA 93: 9975-9979;
-   (b) YAC: Burke et al. (1987) Science 236:806-812;
-   (c) PAC: Sternberg N. et al. (1990) Proc Natl Acad Sci USA. January; 87(1):103-7;
-   (d) Bacteria-Yeast Shuttle Vectors: Bradshaw et al. (1995) Nucl Acids Res 23: 4850-4856;
-   (e) Lambda Phage Vectors: Replacement Vector, e.g., Frischauf et al. (1983) J. Mol Biol 170: 827-842; or Insertion vector, e.g., Huynh et al., In: Glover N M (ed) DNA Cloning: A practical Approach, Vol. 1 Oxford: IRL Press (1985);
-   (f) T-DNA gene fusion vectors: Walden et al. (1990) Mol Cell Biol 1: 175-194; and
-   (g) Plasmid vectors: Sambrook et al., infra.

Typically, the construct comprises a vector containing a sequence of the present invention with any desired transcriptional and/or translational regulatory sequences, such as promoters, UTRs, and 3′ end termination sequences. Vectors can also include origins of replication, scaffold attachment regions (SARs), markers, homologous sequences, introns, etc. The vector may also comprise a marker gene that confers a selectable phenotype on plant cells. The marker may encode biocide resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418, bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorosulfuron, glyphosate or phosphinothricin.

A plant promoter fragment is used that directs transcription of the gene in all tissues of a regenerated plant and/or is a constitutive promoter. Alternatively, the plant promoter directs transcription of a sequence of the invention in a specific tissue (tissue-specific promoter) or is otherwise under more precise environmental control, such as chemicals, cold, heat, drought, salt and many others (inducible promoter).

If proper polypeptide production is desired, a polyadenylation region at the 3′-end of the coding region is typically included. The polyadenylation region is derived from the natural gene, from a variety of other plant genes, or from T-DNA, or is synthesized in the laboratory.

Transformation

Techniques for transforming a wide variety of higher plant species are well known and described in the technical and scientific literature. See, e.g. Weising et al. (1988) Ann. Rev. Genet. 22:421 and Christou (1995) Euphytica, v. 85, n.1-3:13-27.

The person skilled in the art knows processes for the transformation of monocotyledonous and dicotyledonous plants. A variety of techniques are available for introducing DNA into a plant host cell. These techniques comprise transformation of plant cells by DNA injection, DNA electroporation, use of biolistic methods, protoplast fusion and via T-DNA using Agrobacterium tumefaciens or Agrobacterium rhizogenes, as well as further possibilities, or other bacterial hosts for Ti plasmid vectors. See for example, Broothaerts et al. (2005) Gene Transfer to Plants by Diverse Species of Bacteria, Nature, Vol. 433, pp. 629-633.

DNA constructs of the invention are introduced into the cell or the genome of the desired plant host by a variety of conventional techniques. For example, the DNA construct is introduced using techniques such as electroporation, microinjection and polyethylene glycol precipitation of plant cell protoplasts or protoplast fusion. Electroporation techniques are described in Fromm et al. (1985) Proc. Natl Acad. Sci. USA 82:5824. Microinjection techniques are known in the art and well described in the scientific and patent literature. The plasmids do not have to fulfill specific requirements for use in DNA electroporation or DNA injection into plant cells. Simple plasmids such as pUC derivatives can be used.

The introduction of DNA constructs using polyethylene glycol precipitation is described in Paszkowski et al. (1984) EMBO J. 3:2717. Introduction of foreign DNA using protoplast fusion is described by Willmitzer (Willmitzer, L. (1993) Transgenic plants. In: Biotechnology, A Multi-Volume Comprehensive Treatise (H. J. Rehm, G. Reed, A. Pühler, P. Stadler, eds.), Vol. 2, 627-659, VCH Weinheim-New York-Basel-Cambridge).

Alternatively, the DNA constructs of the invention are introduced directly into plant tissue using ballistic methods, such as DNA particle bombardment. Ballistic transformation techniques are described in Klein et al. (1987) Nature 327:773. Introduction of foreign DNA using ballistics is described by Willmitzer (Willmitzer, L., 1993 Transgenic plants. In: Biotechnology, A Multi-Volume Comprehensive Treatise (H. J. Rehm, G. Reed, A. Pühler, P. Stadler, eds.), Vol. 2, 627-659, VCH Weinheim-New York-Basel-Cambridge).

DNA constructs are also introduced with the help of Agrobacteria. The use of Agrobacteria for plant cell transformation is extensively examined and sufficiently disclosed in the specification of EP-A 120516, and in Hoekema (In: The Binary Plant Vector System, Offsetdrukkerij Kanters B. V., Alblasserdam (1985), Chapter V), Fraley et al. (Crit. Rev. Plant. Sci. 4, 1-46) and DePicker et al. (EMBO J. 4 (1985), 277-287). Using this technique, the DNA constructs of the invention are combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host direct the insertion of the construct and adjacent marker(s) into the plant cell DNA when the cell is infected by the bacteria (McCormac et al. (1997) Mol. Biotechnol. 8:199; Hamilton (1997) Gene 200:107; Salomon et al. (1984) EMBO J. 3:141; Herrera-Estrella et al. (1983) EMBO J. 2:987). Agrobacterium tumefaciens-mediated transformation techniques, including disarming and use of binary or co-integrate vectors, are well described in the scientific literature. See, for example Hamilton (1997) Gene 200:107; Müller et al. (1987) Mol. Gen. Genet. 207:171; Komari et al. (1996) Plant J. 10:165; Venkateswarlu et al. (1991) Biotechnology 9:1103 and Gleave (1992) Plant Mol. Biol. 20:1203; Graves and Goldman (1986) Plant Mol. Biol. 7:34 and Gould et al. (1991) Plant Physiology 95:426.

For plant cell T-DNA transfer of DNA, plant organs, e.g. inflorescences, plant explants, plant cells that have been cultured in suspension or protoplasts are co-cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes or other suitable T-DNA hosts. Whole plants are regenerated from the infected plant material or seeds generated from infected plant material using a suitable medium that contains antibiotics or biocides for the selection of transformed cells or by spraying the biocide on plants to select the transformed plants. Plants obtained in this way are then examined for the presence of the DNA introduced. The transformation of dicotyledonous plants via Ti-plasmid-vector systems and Agrobacterium tumefaciens is well established.

Monocotyledonous plants are also transformed by means of Agrobacterium based vectors (See Chan et al. (1993) Plant Mol. Biol. 22: 491-506; Hiei et al. (1994) Plant J. 6:271-282; Deng et al. (1990) Science in China 33:28-34; Wilmink et al. (1992) Plant Cell Reports 11:76-80; May et al. (1995) Bio/Technology 13:486-492; Conner and Domisse (1992) Int. J. Plant Sci. 153:550-555; Ritchie et al. (1993) Transgenic Res. 2:252-265). Maize transformation in particular is described in the literature (see, for example, WO95/06128, EP 0 513 849; EP 0 465 875; Fromm et al., (1990) Biotechnology 8:833-844; Gordon-Kamm et al. (1990) Plant Cell 2:603-618; Koziel et al. (1993) Biotechnology 11:194-200). In EP 292 435 and in Shillito et al. (Bio/Technology (1989) 7:581) fertile plants are obtained from a mucus-free, soft (friable) maize callus. Prioli and Söndahl (Bio/Technology (1989) 7, 589) also report regenerating fertile plants from maize protoplasts of the maize Cateto inbred line, Cat 100-1.

Other cereal species have also been successfully transformed, such as barley (Wan and Lemaux, see below; Ritala et al., see below) and wheat (Nehra et al. (1994) Plant J. 5, 285-297).

Alternatives to Agrobacterium transformation for plants are ballistics, protoplast fusion, electroporation of partially permeabilized cells and use of glass fibers (See Wan and Lemaux (1994) Plant Physiol. 104:37-48; Vasil et al. (1993) Bio/Technology 11:1553-1558; Ritala et al. (1994) Plant Mol. Biol. 24:317-325; Spencer et al. (1990) Theor. Appl. Genet. 79:625-631).

Introduced DNA is usually stable after integration into the plant genome and is transmitted to the progeny of the transformed cell or plant. Generally the transformed plant cell contains a selectable marker that makes the transformed cells resistant to a biocide or an antibiotic such as kanamycin, G 418, bleomycin, hygromycin, phosphinothricin or others. Therefore, the individually chosen marker should allow the selection of transformed cells from cells lacking the introduced DNA.

The transformed cells grow within the plant in the usual way (McCormick et al. (1986) Plant Cell Reports 5, 81-84) and the resulting plants are cultured normally. Transformed plant cells obtained by any of the above transformation techniques are cultured to regenerate a whole plant that possesses the transformed genotype and thus the desired phenotype. Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker that has been introduced together with the desired nucleotide sequences.

Plant regeneration from cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture in “Handbook of Plant Cell Culture,” pp. 124-176, MacMillan Publishing Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1988. Regeneration also occurs from plant callus, explants, organs, or parts thereof. Such regeneration techniques are described generally in Klee et al. (1987) Ann. Rev. of Plant Phys. 38:467. Regeneration of monocots (rice) is described by Hosoyama et al. (Biosci. Biotechnol. Biochem. (1994) 58:1500) and by Ghosh et al. (J. Biotechnol. (1994) 32:1). Useful and relevant procedures for transient expression are also described in U.S. Application No. 60/537,070 filed on Jan. 16, 2004 and PCT Application No. PCT/US2005/001153 filed on Jan. 14, 2005.

After transformation, seeds are obtained from the plants and used for testing stability and inheritance. Generally, two or more generations are cultivated to ensure that the phenotypic feature is stably maintained and transmitted.

One of skill will recognize that after the expression cassette is stably incorporated in transgenic plants and confirmed to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.

The nucleotide sequences according to the invention generally encode an appropriate protein from any organism, in particular from plants, fungi, bacteria or animals. The sequences preferably encode proteins from plants or fungi. Preferably, the plants are higher plants, in particular starch or oil storing useful plants, such as potato or cereals such as rice, maize, wheat, barley, rye, triticale, oat, millet, etc., as well as spinach, tobacco, sugar beet, soya, cotton etc.

In principle, the process according to the invention can be applied to any plant. Therefore, monocotyledonous as well as dicotyledonous plant species are particularly suitable. The process is preferably used with plants that are interesting for agriculture, horticulture, biomass for conversion, textile, plants as chemical factories and/or forestry.

Thus, the invention has use over a broad range of plants, preferably higher plants, pertaining to the classes of Angiospermae and Gymnospermae. Plants of the subclasses of the Dicotyledonae and the Monocotyledonae are particularly suitable. Dicotyledonous plants belong to the orders of the Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensiales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santalales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales, Polemoniales, Lamiales, Plantaginales, Scrophulariales, Campanulales, Rubiales, Dipsacales, and Asterales. Monocotyledonous plants belong to the orders of the Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales. Plants belonging to the class of the Gymnospermae are Pinales, Ginkgoales, Cycadales and Gnetales.

Examples of species represented in these orders are tobacco, oilseed rape, sugar beet, potato, tomato, lettuce, cucumber, pepper, bean, pea, citrus fruit, apple, pear, berries, plum, melon, eggplant, cotton, soybean, sunflower, rose, poinsettia, petunia, guayule, cabbage, spinach, alfalfa, artichoke, corn, wheat, rye, barley, grasses such as switch grass or turf grass, millet, hemp, banana, poplar, eucalyptus trees, conifers.

The invention being thus described, it will be apparent to one of ordinary skill in the art that various modifications of the materials and methods for practicing the invention can be made. Such modifications are to be considered within the scope of the invention as defined by the following claims.

Each of the references from the patent and periodical literature cited herein is hereby expressly incorporated in its entirety by such citation.

1. An isolated nucleic acid molecule comprising: a) a nucleic acid having a nucleotide sequence which encodes an amino acid sequence exhibiting at least 85% sequence identity to an amino acid sequence in Sequence Listing; b) a nucleic acid which is a complement of a nucleotide sequence according to paragraph (a); c) a nucleic acid which is the reverse of the nucleotide sequence according to subparagraph (a), such that the reverse nucleotide sequence has a sequence order which is the reverse of the sequence order of the nucleotide sequence according to subparagraph (a); d) a nucleic acid which is an interfering RNA to the nucleotide sequence according to subparagraph (a); or e) a nucleic acid capable of hybridizing to a nucleic acid according to any one of paragraphs (a)-(c), under conditions that permit formation of a nucleic acid duplex at a temperature from about 40° C. and 48° C. below the melting temperature of the nucleic acid duplex.
 2. The isolated nucleic acid molecule according to claim 1, which has a nucleotide sequence according to any polynucleotide sequence in the Sequence Listing.
 3. The isolated nucleic acid molecule according to claim 1, wherein said amino acid sequence comprises any polypeptide sequence in the Sequence Listing.
 4. A vector construct comprising: a) a first nucleic acid having a regulatory sequence capable of causing transcription and/or translation in a plant; and b) a second nucleic acid having the sequence of the isolated nucleic acid molecule according to claim 1; wherein said first and second nucleic acids are operably linked.
 5. The vector construct according to claim 4, wherein said first nucleic acid is native to said second nucleic acid.
 6. The vector construct according to claim 4, wherein said first nucleic acid is heterologous to said second nucleic acid.
 7. A host cell comprising an isolated nucleic acid molecule according to claim 1 wherein said nucleic acid molecule is flanked by exogenous sequence.
 8. A host cell comprising a vector construct according to claim 4.
 9. An isolated polypeptide comprising an amino acid sequence exhibiting at least 85% sequence identity of an amino acid sequence of the Sequence Listing.
 10. A method of introducing an isolated nucleic acid into a host cell comprising: a) providing an isolated nucleic acid molecule according to claim 1; and b) contacting said isolated nucleic with said host cell under conditions that permit insertion of said nucleic acid into said host cell.
 11. A method of transforming a host cell which comprises contacting a host cell with a vector construct according to claim 4.
 12. A method for detecting a nucleic acid in a sample which comprises: a) providing an isolated nucleic acid molecule according to claim 1; b) contacting said isolated nucleic acid molecule with a sample under conditions which permit a comparison of the sequence of said isolated nucleic acid molecule with the sequence of DNA in said sample; and c) analyzing the result of said comparison.
 13. A plant, plant cell, plant material or seed of a plant which comprises a nucleic acid molecule according to claim 1 which is exogenous or heterologous to said plant or plant cell.
 14. A plant, plant cell, plant material or seed of a plant which comprises a vector construct according to claim 4.
 15. A plant that has been regenerated from a plant cell or seed according to claim 13 or 14.